Subset Seed Automaton

Authors

  • Gregory Kucherov
  • Laurent Noé
  • Mikhail A. Roytberg
Abstract

We study the pattern matching automaton introduced in [1] for the purpose of seed-based similarity search. We show that our definition provides a compact automaton, much smaller than the one obtained by applying the Aho-Corasick construction. We study properties of this automaton and present an efficient implementation of the automaton construction. We also present some experimental results and show that this automaton can be successfully applied to more general situations.


Similar resources

27 Jan 2006: A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract)

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem – a set of target alignments, an associated probability distribution, and a seed model – that are specified by distinct finite automata. The approach is then applied to a new concept of subset seeds for which ...


inria-00001164, version 1 - 24 Mar 2006: A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract)

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem – a set of target alignments, an associated probability distribution, and a seed model – that are specified by distinct finite automata. The approach is then applied to a new concept of subset seeds for which ...


A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract)

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem – a set of target alignments, an associated probability distribution, and a seed model – that are specified by distinct finite automata. The approach is then applied to a new concept of subset seeds for which we pr...


A Unifying Framework for Seed Sensitivity and Its Application to Subset Seeds

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem--a set of target alignments, an associated probability distribution, and a seed model--that are specified by distinct finite automata. The approach is then applied to a new concept of subset seeds for which we...


inria-00170414, version 1 - 7 Sep 2007: Subset seed automaton

We study the pattern matching automaton introduced in [1] for the purpose of seed-based similarity search. We show that our definition provides a compact automaton, much smaller than the one obtained by applying the Aho-Corasick construction. We study properties of this automaton and present an efficient implementation of the automaton construction. We also present some experimental results and...



Publication date: 2007